Skip to content

refactor: Add Activity-based tracing for module execution (Phase 1)#1466

Merged
thomhurst merged 1 commit into
mainfrom
fix/issue-1461-activity-pattern
Dec 30, 2025
Merged

refactor: Add Activity-based tracing for module execution (Phase 1)#1466
thomhurst merged 1 commit into
mainfrom
fix/issue-1461-activity-pattern

Conversation

@thomhurst
Copy link
Copy Markdown
Owner

Summary

  • Add ModuleActivityTracing.cs with ActivitySource for distributed tracing of module execution
  • Update ModuleRunner to wrap module execution in an Activity scope
  • Record success, skip, and failure status on the Activity with appropriate tags

This is Phase 1 (foundation, non-breaking) of the AsyncLocal refactor described in issue #1461. The Activity-based tracing runs alongside the existing AsyncLocal pattern for full backward compatibility.

Details

New file: src/ModularPipelines/Tracing/ModuleActivityTracing.cs

  • Creates an ActivitySource named "ModularPipelines.Modules" for module execution spans
  • Provides helper methods to start activities and record status (success/skip/failure)
  • Defines semantic tag keys following OpenTelemetry conventions

Modified: src/ModularPipelines/Engine/Execution/ModuleRunner.cs

  • Wraps ExecuteModuleWithPipeline in an Activity scope using the new ModuleActivityTracing
  • Records appropriate status based on module execution outcome
  • Properly handles exception recording before re-throwing

Benefits

  • Enables integration with OpenTelemetry and other APM tools
  • Provides foundation for future phases to migrate ambient context from AsyncLocal to Activity
  • Non-breaking change - existing AsyncLocal pattern continues to work

Test plan

  • Build succeeds with no new errors
  • Existing tests continue to pass
  • Activity tracing can be verified with OpenTelemetry listener (manual testing)

Fixes #1461

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown
Contributor

Summary

This PR adds Activity-based distributed tracing infrastructure for module execution as Phase 1 of migrating from AsyncLocal ambient context to Activity-based patterns.

Critical Issues

None found ✅

Suggestions

1. Activity Status Placement Could Cause Resource Leak

The Activity status recording logic is inside the try block at lines 162-172 in ModuleRunner.cs. If an exception occurs during status recording (though unlikely), the finally block won't execute, potentially preventing AsyncLocal cleanup.

Recommendation: Move the Activity status recording to the finally block to ensure it always runs and doesn't interfere with the critical AsyncLocal cleanup:

try
{
    ModuleLogger.Values.Value = logger;
    ModuleLogger.CurrentModuleType.Value = moduleType;
    await ExecuteModuleLifecycle(moduleState, scopedServiceProvider, pipelineContext, executionContext, moduleContext, cancellationToken).ConfigureAwait(false);
}
catch (Exception ex)
{
    // Record failure on the Activity before re-throwing
    ModuleActivityTracing.RecordFailure(activity, ex);
    throw;
}
finally
{
    // Record success/skip status if no exception
    if (executionContext.Status == Enums.Status.Skipped)
    {
        ModuleActivityTracing.RecordSkipped(activity);
    }
    else if (activity?.Status != ActivityStatusCode.Error)
    {
        ModuleActivityTracing.RecordSuccess(activity);
    }
    
    // Clear AsyncLocal to prevent potential leaks
    ModuleLogger.Values.Value = null;
    ModuleLogger.CurrentModuleType.Value = null;
}

2. Minor: Consider Using ActivityKind.Consumer

In ModuleActivityTracing.cs line 63, the Activity is created with ActivityKind.Internal. Since modules represent units of work within the pipeline, ActivityKind.Consumer might be more semantically accurate as modules "consume" work from the pipeline orchestrator. However, Internal is also valid and this is a minor semantic preference.

Verdict

APPROVE - No critical issues

This is a solid foundation for Phase 1. The implementation correctly:

  • Runs alongside existing AsyncLocal for backward compatibility
  • Follows OpenTelemetry semantic conventions
  • Provides proper exception handling and resource disposal
  • Maintains all existing behavior

The suggestion above is optional but would make the code more resilient to edge cases during the transition period.

Add System.Diagnostics.Activity-based tracing alongside existing AsyncLocal
pattern for backward compatibility. This enables integration with OpenTelemetry
and other APM tools.

Changes:
- Add ModuleActivityTracing.cs with ActivitySource for module execution
- Update ModuleRunner to wrap execution in Activity scope
- Record success/skip/failure status on Activity with appropriate tags

This is Phase 1 (foundation, non-breaking) of the AsyncLocal refactor.
Fixes #1461

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Summary

This PR adds Activity-based distributed tracing infrastructure for module execution as Phase 1 of migrating from AsyncLocal ambient context to Activity-based patterns.

Critical Issues

None found ✅

Suggestions

The existing github-actions review raises a good point about exception handling, but the suggested refactoring is actually unnecessary. The current implementation is correct:

  1. The finally block will always execute regardless of whether the status recording throws an exception (because it's still within the try-catch-finally structure)
  2. The AsyncLocal cleanup is guaranteed to run
  3. The Activity disposal via using happens after the entire try-catch-finally completes

The current code structure is actually the right pattern - status recording happens in the "happy path" of the try block, and failure recording happens in the catch before re-throw. This is cleaner than checking activity status in the finally block.

Minor: Consider adding XML docs to public constants

The public tag constants in ModuleActivityTracing.cs (lines 20-42) could benefit from XML documentation comments explaining their purpose and expected values, especially since they follow OpenTelemetry semantic conventions.

Previous Review Status

The github-actions bot suggested moving Activity status recording to the finally block, but this is unnecessary given that the finally block will always execute regardless. The current structure is correct and follows standard try-catch-finally patterns.

Verdict

APPROVE - No critical issues

This is a well-structured Phase 1 implementation that:

  • Correctly integrates Activity tracing alongside AsyncLocal for backward compatibility
  • Follows proper exception handling and resource disposal patterns
  • Uses appropriate OpenTelemetry semantic conventions
  • Maintains all existing behavior without breaking changes

@thomhurst thomhurst merged commit 5fae929 into main Dec 30, 2025
11 of 12 checks passed
@thomhurst thomhurst deleted the fix/issue-1461-activity-pattern branch December 30, 2025 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

REFACTOR: Replace static AsyncLocal ambient logging context with Activity-based pattern

1 participant